A Contextual Language Model to Improve Machine Translation of Pronouns by Re-ranking Translation Hypotheses

Authors

  • Ngoc-Quang Luong
  • Andrei Popescu-Belis
Abstract

This paper addresses translation divergences for pronouns from English to French, specifically "it" and "they", which have several gendered and non-gendered possible translations in French. Instead of using anaphora resolution, which is error-prone, we build a target language model that estimates the probabilities of a tuple of consecutive nouns followed by a pronoun. We provide evidence for the linguistic validity of the model, showing that the probability of observing a pronoun with a given gender and number increases with the proportion of preceding nouns that have the same gender and number. We use this French language model to re-rank the translation hypotheses generated by a phrase-based statistical machine translation system. While none of the pronoun-focused translation systems at the DiscoMT 2015 shared task improved over the baseline, our proposal achieves a modest but statistically significant improvement over it.
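
The re-ranking idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several choices here are assumptions made only for illustration: nouns are represented by gender-number tags such as "fem.sg", probabilities come from add-alpha smoothed counts, and the language model score is added to the decoder score with a single hypothetical weight. The names `train`, `pronoun_logprob`, and `rerank` are also hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count pronouns observed after each tuple of preceding noun tags.

    examples: iterable of (noun_tags, pronoun) pairs, for instance
    (("fem.sg", "fem.sg"), "elle"). The tag format is an assumption.
    """
    counts = defaultdict(Counter)
    for noun_tags, pronoun in examples:
        counts[tuple(noun_tags)][pronoun] += 1
    return counts


def pronoun_logprob(counts, noun_tags, pronoun, vocab, alpha=1.0):
    """Add-alpha smoothed log P(pronoun | preceding noun tags)."""
    context = counts.get(tuple(noun_tags), Counter())
    total = sum(context.values())
    return math.log((context[pronoun] + alpha) / (total + alpha * len(vocab)))


def rerank(hypotheses, counts, vocab, weight=1.0):
    """Sort hypotheses by decoder score plus weighted pronoun LM score.

    hypotheses: list of (text, decoder_score, noun_tags, pronoun).
    The linear combination and its weight are illustrative assumptions.
    """
    def combined(hyp):
        _text, decoder_score, noun_tags, pronoun = hyp
        lm = pronoun_logprob(counts, noun_tags, pronoun, vocab)
        return decoder_score + weight * lm

    return sorted(hypotheses, key=combined, reverse=True)


# Toy data, invented for this sketch only.
VOCAB = ["il", "elle", "ils", "elles"]
toy_counts = train([
    (("fem.sg", "fem.sg"), "elle"),
    (("fem.sg", "fem.sg"), "elle"),
    (("fem.sg", "fem.sg"), "elle"),
    (("masc.sg", "masc.sg"), "il"),
])
toy_hypotheses = [
    ("il est grand", -1.0, ("fem.sg", "fem.sg"), "il"),
    ("elle est grande", -1.2, ("fem.sg", "fem.sg"), "elle"),
]
best_text = rerank(toy_hypotheses, toy_counts, VOCAB)[0][0]
print(best_text)
```

In this toy setup the decoder slightly prefers the masculine hypothesis, while the count-based model favors the pronoun that agrees with the preceding feminine nouns, so the combined score can change the ranking. A real system would need tagged training text and a tuned weight.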

Similar resources

Translation of Power and Solidarity Pronouns in Qur’anic Rhetoric

  Translation of the Holy Quran can be difficult for translators in terms of accuracy and translatability. Sometimes translators fail to render Quranic thoughts because the target language lacks the necessary language features. This results in an unfavorable interpretation. One of the challenging aspects of translating the Quran is reference switching as rhetorical devices, which are widespread i...

A new model for Persian multi-part words edition based on statistical machine translation

Multi-part words in English are hyphenated: a hyphen separates their parts. Persian has multi-part words as well. In Persian morphology, a half-space character is needed to separate the parts of a multi-part word, but in many cases people incorrectly use the space character instead. This common incorrect use of the space character leads to some s...

Machine Translation of Spanish Personal and Possessive Pronouns Using Anaphora Probabilities

We implement a fully probabilistic model to combine the hypotheses of a Spanish anaphora resolution system with those of a Spanish-English machine translation system. The probabilities over antecedents are converted into probabilities for the features of translated pronouns, and are integrated with phrase-based MT using an additional translation model for pronouns. The system improves the trans...

A Comparative Study of English-Persian Translation of Neural Google Translation

Many studies abroad have focused on neural machine translation and almost all concluded that this method was much closer to humanistic translation than machine translation. Therefore, this paper aimed at investigating whether neural machine translation was more acceptable in English-Persian translation in comparison with machine translation. Hence, two types of text were chosen to be translated...

Improving Pronoun Translation for Statistical Machine Translation

Machine Translation is a well-established field, yet the majority of current systems translate sentences in isolation, losing valuable contextual information from previously translated sentences in the discourse. One important type of contextual information concerns who or what a coreferring pronoun corefers to (i.e., its antecedent). Languages differ significantly in how they achieve coreferen...

Publication date: 2016